See Chapter 2 for a refresher on mathematical notation and formulas, including how to interpret the various forms of the summation symbol ∑ (the Greek capital sigma). In the rest of this chapter, we use the simplest form, meaning the form without the i subscripts that refer to specific elements of an array, whenever possible.

Some statistics books use capital letters (such as capital N) to refer to census parameters and the lowercase versions of those letters to refer to sample statistics. In this book, we make it clear each time we present this notation whether we are talking about a census or a sample.

Median

Like the mean, the median is a common measure of central tendency. In fact, it could be argued that the median is the only one of the three that really takes the word central seriously.

The median of a sample is the middle value in the sorted (ordered) set of numbers. By definition, half of the numbers are smaller than the median, and half are larger. The median of a population frequency distribution function (like the curves shown in Figure 9-2) divides the total area under the curve into two equal parts: Half of the area under the curve (AUC) lies to the left of the median, and half lies to the right.
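
The equal-area property can be written compactly. The following is a sketch under one assumption: the population curve is a density f(x) scaled so that the total area under it is 1. The symbols f and m (for the median) are introduced here for illustration and are not notation used elsewhere in this chapter.

```latex
\int_{-\infty}^{m} f(x)\,dx \;=\; \int_{m}^{\infty} f(x)\,dx \;=\; \frac{1}{2}
```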

Consider the sample of diastolic blood pressure (DBP) measurements from seven study participants from the preceding section. If you arrange the values in order from lowest to highest, you can list them as 84, 84, 89, 91, 110, 114, and 116 mmHg. There are seven values, and 91 is the fourth of the seven sorted values, so that is the median. Three DBPs in the sample are smaller than 91 mmHg, and three are larger than 91 mmHg. If you have an even number of values, the median is the average of the two middle values. So imagine that you add a value of 118 mmHg to the top end of your list, so you now have eight values. To get the median, you would average the fourth and fifth values, which gives (91 + 110)/2 = 100.5 mmHg (don’t be thrown off by the 0.5).
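
The odd and even cases described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name median_of and the variable names are hypothetical, and the data are the DBP values from the text.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # the median is defined on the sorted values
    n = len(ordered)
    mid = n // 2                  # middle index (upper of the two middles when n is even)
    if n % 2 == 1:
        return ordered[mid]       # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


dbp_seven = [84, 84, 89, 91, 110, 114, 116]   # DBP values in mmHg from the text
dbp_eight = dbp_seven + [118]                 # the same sample plus 118 mmHg

print(median_of(dbp_seven))   # 91
print(median_of(dbp_eight))   # 100.5
```

Python's standard library also offers statistics.median, which should give the same results for these two lists.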

Statisticians often say that they prefer the median to the mean because the median is much less strongly influenced by extreme outliers than the mean. For example, if the largest value for DBP had been very high (such as 150 mmHg instead of 116 mmHg), the mean would have jumped from 98.3 mmHg up to 103.1 mmHg. But in the same case, the median would have remained unchanged at 91 mmHg. Here’s an even more extreme example: If a multibillionaire were to move into a certain state, the mean family net worth in that state might rise by hundreds of dollars, but the median family net worth would probably rise by only a few cents (if it were to rise at all). This is why you often hear the median rather than the mean income in reports comparing income across regions.
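
To see the outlier effect numerically, here is a small sketch using Python's standard statistics module on the seven DBP values from the text. The variable names are illustrative only.

```python
from statistics import mean, median

dbp = [84, 84, 89, 91, 110, 114, 116]           # original sample, mmHg
dbp_outlier = [84, 84, 89, 91, 110, 114, 150]   # largest value replaced by 150 mmHg

# The mean shifts noticeably, while the median stays at the fourth sorted value.
print(round(mean(dbp), 1), median(dbp))                   # 98.3 91
print(round(mean(dbp_outlier), 1), median(dbp_outlier))   # 103.1 91
```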

Mode